22 research outputs found

    Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius

    Get PDF
    Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ~40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism

    Genomic Analysis of Intrinsically Disordered Proteins in the Genus Camelus

    No full text
    Intrinsically disordered proteins/regions (IDPs/IDRs) fail to fold completely into 3D structures, but have major roles in determining protein function. While natively disordered proteins/regions have been found to fulfill a wide variety of primary cellular roles, the functions of many disordered proteins in numerous species remain to be uncovered. Here, we perform the first large-scale study of IDPs/IDRs in the genus Camelus, one of the most important mammals in Asia and North Africa, in order to explore the biological roles of these proteins. The study includes the prediction of disordered proteins/regions in Camelus species and in humans using multiple state-of-the-art prediction tools. Additionally, we provide a comparative analysis of Camelus and Homo sapiens IDPs/IDRs for the sake of highlighting the distinctive use of disorder in each genus. Our findings indicate that the human proteome is more disordered than the Camelus proteome. Gene Ontology analysis also revealed that Camelus IDPs are enriched in glutathione catabolism and lactose biosynthesis

    On the Prevalence and Potential Functionality of an Intrinsic Disorder in the MERS-CoV Proteome

    No full text
    Middle East respiratory syndrome is a severe respiratory illness caused by an infectious coronavirus. This virus is associated with a high mortality rate, but there is as of yet no effective vaccine or antibody available for human immunity/treatment. Drug design relies on understanding the 3D structures of viral proteins; however, arriving at such understanding is difficult for intrinsically disordered proteins, whose disorder-dependent functions are key to the virus’s biology. Disorder is suggested to provide viral proteins with highly flexible structures and diverse functions that are utilized when invading host organisms and adjusting to new habitats. To date, the functional roles of intrinsically disordered proteins in the mechanisms of MERS-CoV pathogenesis, transmission, and treatment remain unclear. In this study, we performed structural analysis to evaluate the abundance of intrinsic disorder in the MERS-CoV proteome and in individual proteins derived from the MERS-CoV genome. Moreover, we detected disordered protein binding regions, namely, molecular recognition features and short linear motifs. Studying disordered proteins/regions in MERS-CoV could contribute to unlocking the complex riddles of viral infection, exploitation strategies, and drug development approaches in the near future by making it possible to target these important (yet challenging) unstructured regions

    Microsatellite Variation in the Most Devastating Beetle Pests (Coleoptera: Curculionidae) of Agricultural and Forest Crops

    No full text
    Weevils, classified in the family Curculionidae (true weevils), constitute a group of phytophagous insects of which many species are considered significant pests of crops. Within this family, the red palm weevil (RPW), Rhynchophorus ferrugineus, has an integral role in destroying crops and has invaded all countries of the Middle East and many in North Africa, Southern Europe, Southeast Asia, Oceania, and the Caribbean Islands. Simple sequence repeats (SSRs), also termed microsatellites, have become the DNA marker technology most applied to study population structure, evolution, and genetic diversity. Although these markers have been widely examined in many mammalian and plant species, and draft genome assemblies are available for many species of true weevils, very little is yet known about SSRs in weevil genomes. Here we carried out a comparative analysis examining and comparing the relative abundance, relative density, and GC content of SSRs in previously sequenced draft genomes of nine true weevils, with an emphasis on R. ferrugineus. We also used Illumina paired-end sequencing to generate draft sequence for adult female RPW and characterized it in terms of perfect SSRs with 1–6 bp nucleotide motifs. Among weevil genomes, mono- to trinucleotide SSRs were the most frequent, and mono-, di-, and hexanucleotide SSRs exhibited the highest GC content. In these draft genomes, SSR number and genome size were significantly correlated. This work will aid our understanding of the genome architecture and evolution of Curculionidae weevils and facilitate exploring SSR molecular marker development in these species

    Sequencing, analysis, and annotation of expressed sequence tags for Camelus dromedarius.

    Get PDF
    Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and approximately 40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism

    Numbers of unique GeneIDs, GO Terms that are mapped by the ESTs with hits for the nine species analyzed.

    No full text
    Numbers of GeneIDs, GO Terms, and GeneIDs that have a GO annotation are shown for the nine species analyzed, where applicable. For each camel sequence group (contig, singleton, and combination of the two), the number of unique GeneIDs that are “hit” by BLAST analyses is shown. Where applicable, we also show the number of GO terms mapped by the GeneIDs that got hit and the number of GeneIDs among this list that have a mapped GO term.

    BLAST results for contigs, singletons, and their combination shown separately for the nine species analyzed.

    No full text
    The percentage of sequences that got a hit, relative to the total number of sequences in each group (contig, singleton, or combined), is shown separately for each species. For the sequences that got a hit, the average ORF length and the percentage of sequences with ORF >300 bp (relative to the total number of sequences that got a hit) are shown for each group and species.